Dictation API
Overview
This project provides an automatic speech recognition (ASR) system specifically designed for the Irish language. Built on the Vosk speech recognition toolkit, it enables real-time transcription of spoken Irish through a web interface. The system utilizes WebRTC technology to capture audio from users' browsers and process it on a server running a Kaldi-based speech recognition model trained for Irish.
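As a rough illustration of the recognition step described above, the sketch below feeds raw PCM chunks to a recognizer object and joins the finalized utterances. It assumes the server uses the standard Vosk Python recognizer interface (`AcceptWaveform`, `Result`, `FinalResult`). The function name `transcribe_chunks` is hypothetical, and the project's actual `asr_server.py` may be structured differently.

```python
import json


def transcribe_chunks(recognizer, chunks):
    """Feed PCM audio chunks to a Vosk-style recognizer and collect the text.

    `recognizer` is assumed to behave like vosk.KaldiRecognizer:
    AcceptWaveform(bytes) returns a truthy value when an utterance is complete,
    and Result() / FinalResult() return a JSON string with a "text" field.
    """
    utterances = []
    for chunk in chunks:
        if recognizer.AcceptWaveform(chunk):
            text = json.loads(recognizer.Result()).get("text", "")
            if text:
                utterances.append(text)
    # Flush whatever audio is still buffered once the stream ends.
    tail = json.loads(recognizer.FinalResult()).get("text", "")
    if tail:
        utterances.append(tail)
    return " ".join(utterances)
```

With the real toolkit, the recognizer would be created along the lines of `KaldiRecognizer(Model(model_path), 16000)`. The 16 kHz sample rate is an assumption and must match the audio the server actually receives.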
Deployment
Location
The Vosk dictation server currently runs from /home/mrozo/dictation (as of 23/07/2025).
deploy_dictation_server.sh
There is a script in the project folder that automates updating the production server. It does the following:
- Pulls latest changes
- Builds a new Docker image
- Deploys the new Docker image if the build is successful
- Follows the container logs so the initial server output can be checked to verify that it is running
In the future, this should be replaced with the ABAIR CI/CD pipeline.
#!/bin/bash
set -e # exit on error
echo "Pulling latest changes from Git..."
if ! git pull; then
    echo "Git pull failed, aborting."
    exit 1
fi
echo "Git pull successful."
echo "Building Docker image: dictation-asr-server"
sudo docker buildx build -t dictation-asr-server .
echo "Starting containers with docker compose"
# Note: the compose file uses `build: .`, so compose may start its own previously
# built image; `docker compose up -d --build` forces a rebuild if that happens.
sudo docker compose up -d
echo "Finding container ID for dictation-asr-server"
# adjust this filter if your container name differs
# (the compose file sets container_name: asr-server, so "name=asr-server" is an alternative)
CONTAINER_ID=$(sudo docker ps -qf "ancestor=dictation-asr-server")
if [ -z "$CONTAINER_ID" ]; then
    echo "Could not find a running container for dictation-asr-server"
    exit 1
fi
echo "Tailing logs for container: $CONTAINER_ID"
# -f follows the log stream; press Ctrl+C to stop watching (the container keeps running)
sudo docker logs -f "$CONTAINER_ID"
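Watching the logs is a manual check. As a possible complement, the sketch below polls the published port until it accepts a TCP connection. The function name `wait_for_port` and its defaults are hypothetical and not part of the project. An open port only shows that something is listening, not that the model has loaded correctly.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means a process is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the deployment described here, a call such as `wait_for_port("127.0.0.1", 2700)` on the host would match the port published by the compose file.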
Dockerfile
The project contains the following Dockerfile to generate an image:
# Use an official Python runtime as a parent image
FROM python:3.9-slim
# Set the working directory in the container
WORKDIR /usr/src/app
# Install libatomic1 and other dependencies
RUN apt-get update && apt-get install -y \
    libatomic1 \
    && rm -rf /var/lib/apt/lists/*
# Copy the current directory contents into the container at /usr/src/app
COPY . .
# Install any needed packages specified in requirements.txt
# Note: Make sure you have a requirements.txt file with all the dependencies.
RUN pip install --no-cache-dir -r requirements.txt
# Make port 2700 available to the world outside this container
EXPOSE 2700
# Define environment variable
ENV VOSK_MODEL_PATH=/usr/src/app/model
# Copy the static directory
COPY static /usr/src/app/static
# Run asr_server.py when the container launches
CMD ["python", "./asr_server.py"]
docker-compose.yaml
At the time of writing, the project is run on the server with Docker Compose, which brings the service up from the latest image:
version: '3.8'
services:
  asr-server:
    # Build the image from the current directory (requires a Dockerfile here)
    build: .
    container_name: asr-server
    hostname: asr-server
    # Expose the ASR server port (adjust if you use a different port)
    ports:
      - "2700:2700"
    environment:
      # Bind to all interfaces inside the container
      VOSK_SERVER_INTERFACE: "0.0.0.0"
      # The port your ASR server listens on (default 2700)
      VOSK_SERVER_PORT: "2700"
      # Path to your Vosk model directory inside the container
      VOSK_MODEL_PATH: "/usr/src/app/model"
      # Optional: where to dump raw audio (omit or set to empty to disable)
      VOSK_DUMP_FILE: "/app/dump/audio.raw"
    volumes:
      # Mount your local model directory (read-only)
      # Note: this mounts the model at /app/model, while VOSK_MODEL_PATH above
      # points to /usr/src/app/model
      - ./model:/app/model:ro
      # Optional: mount a host directory for audio dumps
      - ./dump:/app/dump:rw
    restart: unless-stopped
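The environment variables above are the server's runtime configuration. How `asr_server.py` actually reads them is not shown here, so the following is only a sketch of one way to load them, with the compose file's values used as defaults. The names `ServerConfig` and `load_config` are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerConfig:
    interface: str
    port: int
    model_path: str
    dump_file: Optional[str]


def load_config(env: Mapping[str, str] = os.environ) -> ServerConfig:
    """Build a ServerConfig from VOSK_* variables (defaults mirror the compose file)."""
    dump_file = env.get("VOSK_DUMP_FILE", "")
    return ServerConfig(
        interface=env.get("VOSK_SERVER_INTERFACE", "0.0.0.0"),
        port=int(env.get("VOSK_SERVER_PORT", "2700")),
        model_path=env.get("VOSK_MODEL_PATH", "/usr/src/app/model"),
        dump_file=dump_file or None,  # unset or empty disables the audio dump
    )
```

Passing the environment as a parameter keeps the loader easy to check in isolation, for example `load_config({"VOSK_SERVER_PORT": "2800"})`.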
Notes
- As of 23/07/2025, the server is not connected to the CI/CD pipeline.